Weighted Block-Asynchronous Relaxation for GPU-Accelerated Systems

Author

  • HARTWIG ANZT
Abstract

In this paper, we analyze the potential of using weights for block-asynchronous relaxation methods on GPUs. For this purpose, we introduce different weighting techniques similar to those applied in block-smoothers for multigrid methods. Having proven a sufficient convergence condition for the weighted block-asynchronous iteration, we analyze the performance of the algorithms implemented using CUDA and compare them with weighted synchronous relaxation schemes like SOR. For test matrices taken from the University of Florida Matrix Collection, we report the convergence behavior and the total runtime for the different weighting techniques. Analyzing the results, we observe that using weights may accelerate the convergence rate of block-asynchronous iteration considerably. This shows the high potential of using weights in block-asynchronous iteration for numerically solving linear systems of equations that fulfill certain convergence conditions. While component-wise relaxation methods are seldom applied directly to linear systems of equations, they often make an important contribution to finite element solvers when used as smoothers in a multigrid framework. Since the parallelization potential of classical smoothers like SOR and Gauss-Seidel is usually very limited, replacing them with block-asynchronous smoothers may have a considerable impact on the overall multigrid performance. Due to the explosion of parallelism in today’s architecture designs, the significance of and the need for highly parallel asynchronous smoothers, such as the ones described in this work, are expected to grow.
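To make the idea of a weighted block-asynchronous relaxation more concrete, the following is a minimal sequential sketch in Python. It is not the CUDA implementation evaluated in the paper. It assumes that the weight `omega` enters as a damping factor on a Jacobi-type update, that each block performs a few local iterations while off-block values stay frozen, and that the lack of synchronization between blocks can be emulated by visiting blocks in a shuffled order. The function name, its parameters, and the test matrix are hypothetical and chosen only for illustration.

```python
import random


def weighted_block_async_relaxation(A, b, omega=1.0, block_size=2,
                                    local_iters=3, sweeps=200, seed=0):
    """Sequentially emulate a weighted block-asynchronous relaxation for A x = b.

    A is a dense square matrix given as a list of rows, b a list of floats.
    All names and defaults here are illustrative assumptions.
    """
    n = len(b)
    x = [0.0] * n
    # Partition the unknowns into contiguous blocks.
    blocks = [list(range(s, min(s + block_size, n)))
              for s in range(0, n, block_size)]
    rng = random.Random(seed)
    for _ in range(sweeps):
        # Assumption: a shuffled visiting order stands in for the
        # unordered execution of thread blocks on a GPU.
        rng.shuffle(blocks)
        for block in blocks:
            in_block = set(block)
            # Off-block contributions are read once and then kept frozen
            # during the local iterations of this block.
            frozen = {i: sum(A[i][j] * x[j]
                             for j in range(n) if j not in in_block)
                      for i in block}
            for _ in range(local_iters):
                new_vals = {}
                for i in block:
                    local = sum(A[i][j] * x[j] for j in block if j != i)
                    jacobi = (b[i] - frozen[i] - local) / A[i][i]
                    # Weighted update: blend the old value with the Jacobi value.
                    new_vals[i] = (1.0 - omega) * x[i] + omega * jacobi
                # Jacobi-style: apply all in-block updates at once.
                for i, value in new_vals.items():
                    x[i] = value
    return x


def max_residual(A, b, x):
    """Largest absolute entry of the residual b - A x."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n)))
               for i in range(n))


if __name__ == "__main__":
    # Strictly diagonally dominant tridiagonal test matrix (illustrative only).
    n = 6
    A = [[4.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row) for row in A]  # exact solution is the all-ones vector
    x = weighted_block_async_relaxation(A, b, omega=0.9)
    print(max_residual(A, b, x))
```

The test matrix is strictly diagonally dominant, a standard setting in which relaxations of this kind converge for weights in (0, 1]. For other matrices or larger weights, convergence depends on conditions such as the one the paper proves, which this sketch does not check.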


Similar resources

Weighted Block-Asynchronous Iteration on GPU-Accelerated Systems

In this paper, we analyze the potential of using weights for block-asynchronous relaxation methods on GPUs. For this purpose, we introduce different weighting techniques similar to those applied in block-smoothers for multigrid methods. For test matrices taken from the University of Florida Matrix Collection we report the convergence behavior and the total runtime for the different techniques. A...


GAMER: a GPU-Accelerated Adaptive Mesh Refinement Code for Astrophysics

We present the newly developed code, GAMER (GPU-accelerated Adaptive MEsh Refinement code), which has adopted a novel approach to improve the performance of adaptive mesh refinement (AMR) astrophysical simulations by a large factor with the use of the graphic processing unit (GPU). The AMR implementation is based on a hierarchy of grid patches with an oct-tree data structure. We adopt a three-d...


Block-asynchronous Multigrid Smoothers for GPU-accelerated Systems

This paper explores the need for asynchronous iteration algorithms as smoothers in multigrid methods. The hardware target for the new algorithms is top-of-the-line, highly parallel hybrid architectures: multicore-based systems enhanced with GPGPUs. These architectures are the most likely candidates for future high-end supercomputers. To pave the road for their efficient use, we must resolve cha...


GPU-Accelerated Asynchronous Error Correction for Mixed Precision Iterative Refinement

In hardware-aware high performance computing, block-asynchronous iteration and mixed precision iterative refinement are two techniques that are applied to leverage the computing power of SIMD accelerators like GPUs. Although they use a very different approach for this purpose, they share the basic idea of compensating the convergence behaviour of an inferior numerical algorithm by a more efficie...


GPU-accelerated iterative solutions for finite element analysis of soil–structure interaction problems

Soil–structure interaction problems are commonly encountered in engineering practice, and the resulting linear systems of equations are difficult to solve due to the significant material stiffness contrast. In this study, a novel partitioned block preconditioner in conjunction with the Krylov subspace iterative method symmetric quasi-minimal residual is proposed to solve such linear equations. T...




Publication date: 2012